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ABSTRACT. The reconstruction of three-dimensional sparse volume functions from few tomo- 
graphic projections constitutes a challenging problem in image reconstruction and turns out to 
be a particular instance problem of compressive sensing. The tomographic measurement matrix 
encodes the incidence relation of the imaging process, and therefore is not subject to design up 
to small perturbations of non-zero entries. We present an average case analysis of the recovery 
properties and a corresponding tail bound to establish weak thresholds, in excellent agreement 
with numerical experiments. Our result improve the state-of-the-art of tomographic imaging in 
experimental fluid dynamics by a factor of three. 



1. Introduction 

Research on compressive sensing j8l [3l focuses on properties of underdetermined linear 
systems 

Ax = b, AeR mxn , m<n, (1.1) 

that ensure the accurate recovery of sparse solutions x from observed measurements b. Strong 
assertions are based on random ensembles of measurement matrices A and measure concen- 
tration in high dimensions that enable to prove good recovery properties with high probability 

A common obstacle in various application fields are the limited options for designing a mea- 
surement matrix so as to exhibit desirable mathematical properties, are very limited. Accord- 
ingly, recent research has also been concerned with more restricted scenarios, spurred by their 



relevancy to applications (cf. Section 2.3). 



Consequently, we consider a representative scenario, motivated by applications in experi- 
mental fluid dynamics (Fig.[T]). A suitable mathematical abstraction of this setup gives rise to 
a huge and severely underdetermined linear system < | 1 . 1 1 ) that has additional properties: a very 
sparse nonnegative measurement matrix A with constant small support of all column vectors, 
and a nonnegative sparse solution vector x: 

A>0, x>0, swpp(A. tj ) = i < m, Wj = 1, . . . , n. (1.2) 

Our objective is the usual one: relating accurate recovery of x from given measurements b 
to the sparsity k = supp(x) of the solution x and to the dimensions m, n of the measurement 
matrix A. The sparsity parameter k has an immediate physical interpretation (Fig. [I}. Engineers 
require high values of k, but are well aware that too high values lead to spurious solutions. The 
current practice is based on a rule of thumb leading to conservative low values of k. 

In this paper, we are concerned with working out a better compromise along with a mathe- 
matical underpinning. The techniques employed are general and only specific to the class of 
linear systems ( |1.1[ ), ( |1.2| ), rather than to a particular application domain. 
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Figure 1. Compressive sensing in experimental fluid dynamics: A multi- 
camera setup gathers few projections from a sparse volume function. This sce- 
nario is described by a very large and highly underdetermined sparse linear 
system ( |1.1[ ) having the additional properties ( L2) ). The sparsity parameter k 
reflects the seeding density of a given fluid with particles. Less sparse scenarios 
increase the spatial resolution of subsequent studies of turbulent motions, but 
compromise accuracy of the reconstruction. Research is concerned with work- 
ing out and mathematically substantiating the best compromise. 



We regard the measurement matrix A as given. Concerning the design of A, we can only 
resort to small random perturbations of the non-zero entries of A, thus preserving the sparse 
structure that encodes the underlying incidence relation of the sensor. Additionally, we exploit 
the fact that solution vectors x can be regarded as samples from a uniform distribution over 
fc-sparse vectors, which represents with sufficient accuracy the underlying physical situation. 

Under these assumptions, we focus on an average case analysis of conditions under which 
unique recovery of x can be expected with high probability. A corresponding tail bound implies 
a weak threshold effect and criterion for adequately choosing the value of the sparsity parameter 
k. Our results are in excellent agreement with numerical experiments and improve the state-of- 
the-art by a factor of three. 

Contribution and Organization. In Section [2j we detail the mathematical abstraction of the 
imaging process and discuss directly related work. In Section [3j we examine recent results of 
compressive sensing based on sparse expanders. This sets the stage for an average case analysis 
conducted in Section [5] and corresponding weak recovery properties, that are in sharp contrast 
to poor strong recovery properties presented in Section |4j We conclude with a discussion of 
quantitative results and their agreement with numerical experiments in Section [6} 

Notation. \X\ denotes the cardinality of a finite set X and [n] = {1,2, ... ,n} for n E N. 
We will denote by ||x|| = x { ^ 0}| and R% = {x E W n : \\x\\ < k} the set of k- 
sparse vectors. The corresponding sets of non-negative vectors are denoted by IR™ and 
respectively. The support of a vector x E W 1 , supp(s) C [n], is the set of indices of non- 
vanishing components of x. With I + (x) = {i: Xj > 0}, I°(x) = {i: Xi = 0} and I~(x) = 
{i: Xi < 0}, we have supp(a;) = I + (x) U I~(x) and ||x|| = |supp(o;)|. 

For a finite set S, the set Af(S) denotes the union of all neighbors of elements of S where 
the corresponding relation (graph) will be clear from the context. 

1 = (1,...,1) T denotes the one-vector of appropriate dimension. 

A, t i denotes the i-th column vector of a matrix A. For given index sets /, J, matrix Ajj 
denotes the submatrix of A with rows and columns indexed by / and J, respectively. I c , J c 
denote the respective complement sets. Similarly, bj denotes a subvector of b. 

E[-] denotes the expectation operation applied to a random variable and Pr(A) the probability 
to observe an event A. 
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2. Preliminaries 

2.1. Imaging Setup and Representation. We refer to Figure [2] for an illustration of the math- 
ematical abstraction of the scenario depicted by Figure [TJ In order to handle in parallel the 2D 
and 3D cases, we will use the variable 

De{2,3}. (2.1) 

We measure the problem size in terms of d £ N and consider n := dP cells in a square 
(D = 2) or cube {D = 3) and m := Dd D ~ x rays, compare Fig. [2j left and right. It will be 
useful to denote the set of cells by C = [n] and the set of rays by R = [m). The incidence 
relation between cells and rays is given byamxn measurement matrix 

(A D ) = { if J " th my intersects iAh cel1 ' (2 2) 

d %3 [ 0, otherwise, 

for all i G [m], j G [n]. Thus, cells and rays correspond to columns and rows of A®. 

The incidence relation encoded by A® gives rise to the equivalent representation in terms of 
a bipartite graph G = (C, R; E) with left and right vertices C and R, and edges cr G £ iff 
(A® ) rc = 1. Figure [2] illustrates that G has constant left-degree £ = D. It will be convenient 
to use a separate symbol L 

For a fixed vertex i, any adjacent vertex j ~ 2 is called neighbor of i. For any non-negative 
measurement matrix A and the corresponding graph, the set 

AT(S) = {ie [m]: i ~ j, j'GS} = {i6 [m]: ^ > 0, j G 5} 

contains all neighbors of S 1 . The same notation applies to neighbors of subsets S C [m] of right 
nodes. 

With slight abuse, we call the matrix A® that encodes the adjacency r ~ c of vertices 
r E R and c 6 C adjacency matrix of the induced bipartite graph G, deviating from the 
usual definition of the adjacency matrix of a graph that encodes the adjacency of all nodes 
Vi ~ Vj, V — C U R. Moreover, in this sense, we will call any non-negative matrix adjacency 
matrix, based on its non-zero entries. 

Let A be the non-negative adjacency matrix of a bipartite graph with constant left degree 1. 
The perturbed matrix A is computed by uniformly perturbing the non-zero entries Aij > 
to obtain Ay G [A^ — e, A^ + e], and by normalizing subsequently all column vectors of A. 
In practice, such perturbation can be implemented by discretizing the image by radial basis 
functions and choose their locations on an irregular grid, see lfl4l . 

The following class of graphs plays a key role in the present context and in the field of 
compressed sensing in general. 

Definition 2.1. A (v, 5)-unbalanced expander is a bipartite simple graph G = (L, R; E) with 
constant left-degree i such that for any X C L with \X\ < u, the set of neighbors Af(X) C R 
of X has at least size \Af(X)\ > 5£\X\. 

2.2. Deviation Bound. We will apply the following inequalities for bounding the deviation of 
a random variable from its expected value based on martingales, that is on sequences of random 
variables (Xi) defined on a finite probability space (fi, J 7 , fi) satisfying 

EfXi+il^] =Xi, for all i > 1, (2.3) 

where T% denotes an increasing sequence of cx-fields in T with X{ being J^-measurable. 

This setting applies to random variables associated to measurements that are statistically 
dependent due to the intersection of projection rays (cf. Fig. [2]). 



4 



PETRA, SCHNORR 







1 


I 










































































Aid 




















Figure 2. Left: 2D imaging geometry with d 2 cells and 2c? projection rays 
(here: d = 6). The incidence relation is given by the measurement matrix 
A = A 2 , (cf . Eqn. ( |2.2[ )) which is the adjacency of a bipartite graph with constant 
left degree £ = 2. Right: 3D imaging geometry with d 3 cells and 3c? 2 rays (here: 
d = 7). The incidence relation given by the measurement matrix A = A 3 d is the 
adjacency of a bipartite graph with constant left degree t = 3. 



Theorem 2.1 (Azuma's Inequality Q3[6]|)> Let pQ)j = o,i,2,... be a sequence of random variables 
such that for each i, 

{Xi-X^tl^a. (2.4) 

Then, for all j > and any 5 > 0, 

Pr - X | > 5) < 2exp ( - V (2.5) 

2.3. Related Work. Although it was shown [O that random measurement matrices are op- 
timal for Compressive Sensing, in the sense that they require a minimal number of samples 
to recover efficiently a fc-sparse vector, recent trends [|2l |20l tend to replace random dense 
matrices by adjacency matrices of "high quality" expander graphs. Explicit constructions of 
such expanders exist, but are quite involved. However, random m x n binary matrices with 
nonreplicative columns that have [£n\ entries equal to 1, perform numerically extremely well, 
even if i is small, as shown in [2]. In [|T2l it is shown that perturbing the elements of adjacency 
matrices of expander graphs with low expansion, can also improve performance. This findings 
complement our prior work in lfT4l . where we observed that by slightly perturbing the entries of 
a tomographic projection matrix its reconstruction performance can be improved significantly. 

We wish to inspect the bounds on the required sparsity that guarantee exact reconstruction 
of most sparse signals, and corresponding critical parameter values similar to weak thresholds 
in ifTOlfTTfl . The authors have computed sharp reconstruction thresholds for Gaussian measure- 
ments, such that for given a signal length n and numbers of measurements m, the maximal 
sparsity value k which guarantees perfect reconstruction can be determined precisely. 

For a matrix A 6 W mxn , Donoho and Tanner define the undersampling ratio 5 — — G (0, 1) 
and the sparsity as a fraction of m, k = pm, for p e (0, 1). The so called strong phase transition 
Ps(S) indicates the necessary undersampling ratio S to recover all A;-sparse solutions, while the 
weak phase transition pw($) indicates when x* with \\x* || < Pw($) • m can be recovered with 
overwhelming probability by linear programming. 

Relevant for TomoPIV is the setting as 5 — > and n — > oo, that is severe undersampling, 
since the number of measurements is of order O(10 4 ) and discretization of the volume can 
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be made accordingly fine. For Gaussian ensembles a strong asymptotic threshold ps(S) ~ 
(2elog(l/5) _1 and weak asymptotic threshold pw{$) ~ (2 log(l/<5) _1 holds, see e.g. ifTOll . 
In this highly undersampled regime, the asymptotic thresholds are the same for nonnegative 
and unsigned signals. Exact sparse recovery of nonnegative vectors has been also studied in a 
series of recent papers IPT21IT81 , while [[T5l[T6l additionally assumes that all nonzero elements 
are equal to each other. As expected, additional information, improves the recoverable sparsity 
thresholds. 

2.3.1. Strong Recovery. The maximal sparsity k depending on m and n, such that all sparse 
signals are unique and coincide with the unique positive solution of Ax = b, is investigated 
in |[T0l [TTTl from the perspective of convex geometry by studying the face lattice of the convex 
polytope convjA. !, . . . , A, _ n , 0}. It is related to the nullspace property for nonnegative signals 
in what follows. 

Theorem 2.2 (OH [H [TBI (Hi). Let A G R mxn be an arbitrary matrix. Then the following 
statements are equivalent: 

(a) Every k-sparse nonnegative vector x* is the unique positive solution of Ax = Ax*. 

(b) The convex polytope defined as the convex hull of the columns in A and the zero vector, 
i.e. conv{A. i, . . . , A m>n , 0} is outwardly k-neighborly. 

(c) Every nonzero null space vector has at least k + 1 negative (and positive) entries. 

2.3.2. Weak Recovery. Thm. 2 in |[T0l shows the equivalence between (k, e)-weakly (out- 
wardly) neighborliness and weak recovery, i.e. uniqueness of all except a fraction e of fc-sparse 
nonnegative vectors. Weak neighborliness is the same thing as saying that AAq -1 has at least 
(1 — e)-times as many (k — 1) -faces as the simplex Aq _1 . A different form of weak recovery is 
to determine the probability that a random A;-sparse positive vector by probabilistic nullspace 
analysis. This concepts are related for an arbitrary sparse vector with exactly k nonnegative 
entries in the next theorem. 

Theorem 2.3. Let A G jjjmx™ ^ an ar jy^ rar y ma trix. Then the following statements are 
equivalent: 

(a) The k-sparse nonnegative vector x* supported on S, \S\ = k, is the unique positive 
solution of Ax = Ax*. 

(b) Every nonzero null space vector cannot have all its negative components in S. 

(c) AsIR^ is a k-face ofAW^, i.e. there exists a hyperplane separating the cone generated 
by the linearly independent columns {A,j}j e sffom the cone generated by the columns 
of the off-support {A,j} jeS c 

Proof. Statement (a) holds if and only if there is no v ^ such that Av = and vs? > 0, 
compare for e.g. O Thm. 1]. Thus (a) <^> (b). By EH Lem. 5.1], (a) <^> (c) holds as well. □ 

If, in addition, all k nonzero entries are equal to each other, then a stronger characterization 
holds. 

Theorem 2.4 flfT3l Prop. 2]). Let A e fl£ mxn ^ e an ar bif rar y ma trix. Then the following 
statements are equivalent: 

(a) The k-sparse binary vector x* G {0, l} n supported on S, \S\ = k, is the unique solution 
of Ax = Ax* with x G [0, If . 

(b) Every nonzero null space vector cannot have all its negative components in S and the 
positive ones in S c . 

(c) There exists a vector r such that Diag(,2*)/l T r > 0, with z* := e — 2x*. 

(d) G M m is not contained in the convex hull of the columns of ADi&g(z*), i.e. ^ 
conv{z*A, A , . . . , z* n A, tTl , 0}, with z* := e - 2x*. 
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Proof. If x* is unique in {0, l} n , it is unique in [0, l] n as well. Uniqueness in [0, l] n holds, for 
e.g. by lfl"3l Thm. 1], if there is no v ^ such that Av = 0, i>gc > and v$ < 0, which shows 
equivalence to (b). With D := Diag(e — 2x*) and DD = I, (b) can be rewritten as follows: 
there is no v ^ such that ADDv = 0, Dv > 0, Dv ^ 0. With u := Dv, the above condition 
becomes: 

ADu = 0, u > 0, u 7^ , has no solution , 

which by Gordon's theorem of alternative gives the equivalent certificate (c): 

3r such that DA T r > . (2.6) 

In other words, a small A;-subset of the columns of A, are "flipped" by multiplication with — 1, 
and these modified columns together with all remaining ones can be separated from the origin, 
which shows equivalence to (d), i.e. is not contained in the convex hull of these points. □ 

Note that statement (d) is related to the necessary condition for uniqueness in lfT8l Thm. 1]. 



We further comment on Thm. 2.4 (c) from a probabilistic viewpoint. Condition (c) says that all 
points defined by the columns of A Diag(e — 2x*) are located in a single half space defined by a 
hyperplane through the origin with normal r. Conditions under which this is likely to hold were 
studied by Wendel Ifl9l . This problem is also directly related to the basic pattern recognition 
problem concerning the linear classificatiorj^of any dichotomy of a finite point set 0. 

Assuming n points in IR m to be in general position, that is any subset of m vectors is lin- 
early independent, and that the distribution from which the given point set is regarded as an 



i.i.d. sample set is symmetric with respect to the origin, then condition ( |2.6[ ) holds with proba- 
bility 

As Figure [3] illustrates, Pr(n, m) = 1 if n/m < 1, due to the well known fact that any di- 
chotomy of m + 1 points in M. m can be separated by a hyper-plane |[P71 0. For increasing 
dimension m — > oo, this also holds almost surely if n/m < 2, which can be easily deduced 
by applying a binomial tail bound. Accordingly, assuming that the measurement matrix A con- 
forms to the assumptions, the authors of [!13j conclude that an existing binary solution to ( |1.1[ ) 



is unique with probability ( |2.7[ ) for underdetermined systems with ratio m/n > 1/2. 



We adopt this viewpoint in Section 5.3 and develop a criterion for unique recovery with high 
probability using the given measurement matrix ( |2.2| ), based on a probabilistic average case 
analysis of condition ( |3.9[ ) (Section |5TTj). This criterion currently characterizes best the design 



of tomographic scenarios (Fig. [2]), with recovery performance guaranteed with high probability. 
We conclude this section by mentioning that exact nonasymptotic recovery results for a /c-sparse 
nonnegative vector are obtained in [|TTl Thm. 1.10] by exploiting Wendel's theorem. Donoho 
and Tanner show that the probability of uniqueness of a A;-sparse nonnegative vector equals 
Pr(n — m,n — k), provided A satisfies certain conditions which do not hold in our considered 
application. 

3. Expanders, Perturbation, and Weak Recovery 

This section collects recent results of recovery properties based on expanders associated with 
sparse measurement matrices, possibly after a random perturbation of the non-zero matrix en- 



tries. Section 3.3 applies these results to our specific setting in a form suitable for a probabilistic 



analysis of recovery performance presented in Section |5j 



'in this context, "linear" means affine decision functions. 
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Figure 3. The probability Pr(n, to) given by ( |2.7[ ) that n points in gen- 
eral position in IR m can be linearly separated [|T9ll . This holds with probability 

Pr(n, to) = 1 for n/m < 1, and with Pr(n, to) — ► 1 if to — >■ oo and 1 < n/m < 
2. 



3.1. Expanders and Recovery. The following theorem is a slight variation of Theorem 4 in 
lfT8ll tailored to our specific setting. 

Theorem 3.1. Let A be the adjacency matrix of a {y, 5) -unbalanced expander and 1 > 5 > 
2 1 - Then for any k-sparse vector x* with k < r^gy the solution set {x : Ax = Ax*, x > 0} 
is a singleton. 

Proof. We will show that every nonzero null space vector has at least + 1 negative and 
positive entries. Then Theorem Z2]will provide the desired assertion. 

Suppose without loss of generality that there is a vector v G ker(A) \ {0} with 

8 = '''Ml £ (TTs) ' 

Then 

i\r(v)\>\Af(r(v))\>6£s, (3.2) 

where the second inequality follows by assumption due to the expansion property. 
Denoting by S the support of v, S = I~ (v) U I + {v), we have 

U{I-{v))=U{I + {v))=U{S), (3.3) 

since otherwise Av ^ because A is non-negative. 

From £\I + (v)\ > \N(I+(v))\, ^3} and we obtain 

\I + (v)\>5s. (3.4) 

Thus, 

|5| = \r(v)\ + \I + (v)\ > 25s > (1 + 8)s. (3.5) 
Let S C S such that \S\ = [(5 + l)sj . Thus |5| < i/ and 

|A^(5)| > > + l)s > si (3.6) 

provided 5(1 + 5) > 1 <£> 6 > (y/E - l)/2. Summarizing, we get s£ < |7V(S)| < |jV(S)| = 
\J\f(I~ (v))\ < si, hence a contradiction. □ 



The assertion of Theorem 3.1 solely relies on the expansion property of the measurement 
matrix A. Theorem [34] below will be based on it and in turn the results of Section I3T2] 
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3.2. Perturbed Expanders and Recovery. We describe next an alternative route based on the 
complete (Kruskal) rank r = r (A) of a measurement matrix A. This is the maximal integer 
r such that every subset of r columns of A is linearly independent. 

While this number is combinatorially difficult to compute in practice, both the number and 
the corresponding recovery performance can be enhanced by relating it to a particular expansion 
property of the bipartite graph associated to a perturbed measurement matrix A. The latter can 
be easily computed in practice while preserving its sparsity, i.e. the constant left-degree £. 

Theorem 3.2 ( lfl4l Thm. 6.2], lfT2l Thm. 4.1]). Let Abe a non-negative matrix with £ non-zero 
entries in each column and complete rank r$ = tq(A). Then \I~{y)\ > r^jifor all nullspace 
vectors v G ker(A). 



Remark 3.1. In view of Theorem 2.2, (c), Theorem |3.2 says that all /c-sparse non-negative 



vectors x can be uniquely recovered if k < \r /£ — 1]. 

The following Lemma asserts that by a perturbation of the measurement matrix the complete 
rank, and hence the recovery property, may be enhanced provided all subsets of columns, up to 
a related cardinality, entail an expansion that is less however than the one required by Theorem 

EU 

Lemma 3.3 ([12, Lemma 4.2]). Let Abe a non-negative matrix with £ non-zero entries in each 
column. Suppose that for a submatrix formed by r columns of A it holds that \J\f(X)\ > \X\, 
for each subset X C C of columns of cardinality \C\ < fo, and with respect to the bipartite 
graph induced by A. Then there exists a perturbed matrix A that has the same structure as A 
such that its complete rank satisfies tq(A) > fo. 

Theorem 13.51 below and Section [531 will be based on Theorem 13 .21 and Lemma l33l 

3.3. Weak Reconstruction Guarantees. We introduce some further notions used subse- 
quently to state our results. Let A denote the matrix A® defined by ( |2.2[ ), and consider a 



subset X C C of |X| = k columns and a corresponding A;-sparse vector x. Then b = Ax 
has support Af(x), and we may remove the subset of J\f(X) c = (J\f(X)) c rows from the linear 
system Ax = b corresponding to b r = 0, Vr G R. Moreover, based on the observation J\f(X), 
we know that 

XCAf(Af(X)) and Af(Af(X) c ) n X = . (3.7) 
Consequently, we can restrict the linear system Ax = b to the subset of columns M(M(X)) \ 



M(Af(X) c ) C C. This will be detailed below by Proposition 5.1 



In practical applications, the reconstruction of a random A;-sparse vector x will be based on 
a reduced linear system with the above dimensions. These dimensions will be the same for all 
random sets X = supp(x) contained in J\f(J\f(X)). Consequently, in view of a probabilistic 
average case analysis conducted in Section [5J it suffices to measure the expansion with respect 
to these sets. 



Taking this into account, the following theorem tailors Theorem 3.1 to our specific setting 



Theorem 3.4. Let A be the adjacency matrix of a bipartite graph such that for all random 
subsets X C C of\X\ < k left nodes, the set of neighbors N (X) of X satisfies 

\N{X)\ > 5£\H{M{X))\M{M{X) C )\ with 5>^^. (3.8) 
Then, for any k-sparse vector x*, the solution set {x : Ax = Ax*, x > 0} is a singleton. 



Likewise, the following theorem applies the statements of Section 3.2 to our specific setting. 
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Theorem 3.5. Let A be the adjacency matrix of a bipartite graph such that for all subsets 
X C C of\X\ < k left nodes, the set of neighbors J\f(X) of X satisfies 

\N(X)\>5l\N{N{X))\N{N{X) c )\ with 8>- (3.9) 

Then, for any k-sparse vector x*, there exists a perturbation A of A such that the solution set 
{x : Ax = Ax*, x > 0} is a singleton. 

The consequences of Theorems |3.4 and 3.5 are investigated in Section [5] by working out 
critical values of the sparsity parameter k for which the respective conditions are satisfied with 
high probability. 



4. Strong Equivalence 

In [[141 we tested the properties of the discrete tomography matrix in focus against various 
conditions, like the null space property, the restricted isometry property, etc., and predicted an 
extremely poor worst case performance of such a measurement system. In the 3D case we 
showed that the strong threshold on sparsity, that is the maximal sparsity level k for which 
recovery of all fc-sparse (positive) vectors, k < k , is guaranteed, is a constant, not depending 
on the undersampling ratio d. 



4. 1 . Unperturbed Systems. Given an indexing of cells and rays, we can rewrite the projection 



matrix A® G 



)Dd D ~ 



from (|2.2|) in closed form as 

Id 



A 



D 



t T d ®Id 



/II 



V 



Id®Id\ 
id ® lj ® Id 
id <g) id ®lj / 



if D 



if D 



(4.1) 



Since for this matrices a sparse nullspace basis can be computed, we can derive the maximal 
sparsity via the nullspace property, as shown next. 



Proposition 4.1. d Prop. 2.2, Prop. 3.2] Let D e {2, 3}, d E N, d > 3 and A^ from ([47T) 

nd D x(d-l) D 



Define B? G 



as 



B 



D 









id-1 J 




\ Id-i 
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Id-1 J 


V h-1 



id-1 



ifD = 2, 
ifD = 3 



(4.2) 



Then the following statements hold 
(a) A D d B D d = 0. 



(b) Every column in B® has exactly 2 D nonzero (2 D_1 positive, 2 D ^ X negative) elements. 

(c) is a full rank matrix and rank(_B| ) ) — (d— 1) D . 

(d) ker(A d 3 ) = span{B d }, i.e. the columns of B® provide a basis for the null space of 

A D 
n d ■ 

(e) rank(Af ) = d D - (d - l) D . 

(f) YTi=i v i = ® holds for all v G ker(Af ). 
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Figure 4. Two different non unique 4-sparse "particle" distributions in a 
3x3x3 volume. Both configurations (represented by black and white dots) 
yield identical projections in all three directions. Such nonunique configurations 
correspond to positive or negative entries in an 8-sparse nullspace vector of A% 
compare to Prop. |4T 



(g) The Kruskal rang of A® is 2° — 1, i.e. 



mm 

uSker(Aj) 



Mo 



if 



(h) Every nonzero nullspace vector has at least 2 negative entries, i.e. 



mm 

veker(Af) 



>£>-! 



Thus, (g) and (h) imply 

Corollary 4.2. For all d 

sparsest solution of A® x 



N, d 
A D d x* 



> 3, every (2 D 1 — \)-sparse vector x* is the unique 
Moreover, for every (2 D ~ 1 — \)-sparse positive vector 



A°x* 



x* {x: A®x = A®x*} is a singleton. 

This bound is tight, since we can construct two 2 D_1 -sparse solutions x 1 and x 2 such that 
A®x 2 , compare Fig. [4] for the 3D case. However, when D=3, not every 8-column 
combination, or more, in A d is linearly dependent. In fact, only a limited number of /c-column 
combinations can be dependent without violating rank(A^) = 3c? 2 — 3d + 1. It turns out that 
this number is tiny for smaller k when compared to (?). As k increases this number also 
grows and equals 1 only when k > rank(^). Likewise, not every 4-sparse binary vector is 
nonunique. Due to the simple geometry of the problem it is not difficult to count the "bad" 
4-sparse configurations in 3D. Since they are always located in 4 out of 8 corners of a cuboid 
in the d 3 cube, compare Fig. [4] left, and there are only two possibilities to choose them, the 
probability that a 4-sparse binary vector is unique, equals 

2 © 3 - ! _ , y-i>'_ = 1 _ 0(rf -«) / - 



l 



(?) 



(d 2 + d+l)(d 3 -2)(# -3) 



1 



-> 1 



4.2. Perturbed Systems. The weak performance of A d rests upon its small Kruskal rank. In 
order to increase the maximal number k of columns such that all k (or less) column combina- 
tions are linearly independent we perturb the nonzero entries of the original matrix A®. Figure 
[5J right, indicates that perturbation leads to less sparse nullspace vectors. If we could estimate 
the Kruskal rank f of the perturbed system we could apply Thm. 3.2 and obtain a lower bound 
on the sparsity yielding strong recovery for all \f /£ — 1] -sparse vectors. However, determin- 
ing f for the perturbed matrix seems impossible. We believe however that it increases with d, 
in contrast to the constant 2 D — 1 in case of unperturbed systems. Luckily, it will turn out in 



Section 5.2 that the weak recovery threshold for unperturbed systems will give a lower bound 
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Figure 5. Left: The 3D projection matrix A\ and, middle, a sparse basis 
which spans its nullspace. Right: If we allow a small perturbation of the 
nonzero entries of Af , all corresponding nullspace vectors of the perturbed ma- 
trix will be less sparse and lie in a d ^ 1 (d— D) -dimensional subspace, compared 
to (d — 1) D in the unperturbed case. 

on the strong recovery threshold for perturbed matrices, since reduced systems will be strictly 
overdetermined and guaranteed to have full rank. 



5. Weak Recovery 

In this section, we consider the recovery properties of the 3D setup depicted in Fig. [2] and 
establish conditions for weak recovery, that is conditions for unique recovery that holds on 
average with high probability. We clearly point out that our conditions do not guarantee unique 
recovery in each concrete problem instance. 

Remark 5.1. In what follows, the phrase with high probability refers to values of the sparsity 
parameter k for which random supports |supp(6)| concentrate around the crucial expected 



value Nr according to Prop. 5.3 thus yielding a desired threshold effect. 



We first inspect in Section [5J~ the effect of sparsity on the expected dimensions of a reduced 
system of linear equations, along with its equivalence to the original system. Subsequently, we 



establish the aforementioned conditions based on Theorems 3.4 and 3.5 and on the expected 
quantities involved in the corresponding conditions. 

In particular, we establish such uniqueness conditions for reduced underdetermined systems 
of dimension m/n > (\/5 — l)/2 ~ 0.618. Our results are in excellent agreement with 
numerical experiments discussed in Section [6} 



5.1. Reduced System. We formalize the system reduction described in Eqn. ( |3.7| ). Besides 
checking its equivalence to the unreduced system, we compute the expected reduced dimen- 
sions together with a deviation bound. Additionally, we determine critical values of the sparsity 
parameter k that lead to overdetermined reduced systems. 



Recall from Section 2.1 that we regard a given measurement matrix A also as adjacency 
matrix of a bipartite graph G = (C, R; E). 

5.1.1. Definition and Equivalence. 

Definition 5.1. The reduced system corresponding to a given non-negative vector b, 

A ™ I. AC J\$ m red xn red 

' tx red' L — u redy -^red t IKi + ) 



(5.1) 
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results from A, b by choosing the subsets of rows and columns 

R b := supp(6), C b := N(R b ) \N(R c b ) (5.2) 

with 

m red '■= \Rb\, n red '■= | Cfo | - (5.3) 

Note that for a vector x and the bipartite graph induced by the measurement matrix A, we 
have the correspondence (cf. p.7[ )) 

X = supp(x), R b = Af(X), C b = Af(Af(X)) \ Af(Af(X) c ). 

We further define 

S + := {x: Ax = b,x > 0} (5.4) 

and 

Sted ■= i x : A R b c b x = b Rb , x > 0} . (5.5) 

The following proposition asserts that solving the reduced system ( |5.1[ ) will always recover the 
support of the solution to the original system Ax = b. 

Proposition 5.1. Let A G ]^ mxn an d \, g ]j m have nonnegative entries only, and let S + and 
Sy ed be defined by ( |5.4[ ) and ( |5.5[ ), respectively. Then 

S + = {xGl": x [Cb y = and x Cb G «S+ d }. (5.6) 

Proo/ Let S := {x 6 R": x (Cfc)c = and x Cb G <S+ d }. We first show 5 C <S+. Let ig5. 
From this x > follows directly. We thus just have to show Y^j=i a ij x j = ^ e W- Indeed, 
for 

n 

^ G R b . ^^^^ CL^jX j — ^^^^ Chj^jXj I ^^^^ ^"ij X j — 5 

whereas for 

n 

i G {R b ) c : '^dijXj = 22 ai i x i + ^2 aij Xj = ® = ^ ' 
Now let x G S + and consider any i G (i?fe) c . Then 

— — ^^^^ CL^jXj — ^^^^ (X^2 X j I ^^^^ X j (5*7) 

J =1 ie^^T ieW^ 



holds. Since x > 0, we obtain from ( |5.7[ ) that Xj = 0, Vj G (C fe ) c . To show that A RbCb x Cb = 
b Rb , consider 

n 

i E R b : 

ieCj jec b je(c b y j=i 

Hence, X( Cb y = and xc 6 G <S+ d . Thus x G S. □ 

In the following two sections, we compute the expected values of the reduced system dimen- 
sion (15. 31). 
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5. 1 .2. Expected Number of Non-Zero Measurements. We consider the uniform random assign- 
ment of k particles to the n = \C\ cells c G C. A single cell may be occupied by more than 
a single particle. This corresponds to the physical situation that real particles are very small 
relative to the discretization depicted by Figure [2] The imaging optics enlarges the appear- 
ance of particles, and the action of physical projection rays is adequately represented by linear 
superposition. 

This scenario gives rise to a random vector x 6 R£ + with support |supp(x)| < k. It 
generates a vector 

b = A D d x G (5.8) 
of measurements. We are interested in the expected size of the support of b, 

iV fl :=E[|supp(6)|], N° R :=m-N R , (5.9) 

that equals the number of projection rays r 6 R with non- vanishing measurements b r ^ 0. We 
denote the event b r = by the binary random variable[^X r = 1, i.e. X r = corresponds to the 
event b r > that at least a single particle meets ray r. 
The probability that a single c is met by ray r is 



d d 



^' \C\ n d D ~ x ' 
For k particles, the probability that < i < k particles meet projection ray r is 



(5.10) 



Pr(&, = i) = Q<?>!T\ Pd := 1 - Qd- (5-11) 

Consequently, we have 

Pr[X r = 1] = E[X r ] = p k d , (5.12a) 

Pr[ X r = 0] = J2 (fjtiPd- 1 = 1 ~Pd- (5- 12b) 



N R = N R {k) = \R\(l - p k d ) = Dd D - 1 (l - (l 



Lemma 5.2. The expected number of non-zero measurements defined by ( |5.9| ) is 

1 \fc N 

K = K(k) = \R\ -N R = \R\p k d = Dd D ~ l (l - J-j 
Proof. Due to the linearity of expectation, summing over all rays gives 

N R = E[j2a-X r )] =\R\(l-p k d ). 



(5.13) 



□ 



Remark 5.2. Note that A^^ specifies the expected value of m re d in ( |5.3[ ) induced by random 
/c-sparse vectors x G See Figure [6] for an illustration. 



We economize notation here by re-using the symbol X, a random indicator vector indexed by rays (right 
nodes) r E R. Due to the context, there should be no danger of confusion with X = supp(x) denoting random 
subsets of left nodes used in other sections. 
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Figure 6. The expected number N R of non-zero measurements (5.12). For 



highly sparse scenarios (small k), the expected support ( |5.9[ ) of the measurement 
vector | supp(6)| ~ 3k. For large values of k, this rate decreases due to the 
multiple incidence of cells and projection rays. 



Bounding the Deviation of N^. We are interested in how sharply the random number X = 
J2reR °f zer ° measurements peaks around its expected value = E[X] given by ( |5.13| ). 
We derive next a corresponding tail bound by regarding a sequence of k randomly located 
cells and by bounding the difference of subsequent conditional expected values of the random 



variable X. Theorem 2.1 then provides a bound for the deviation \X — K[X] |. 

Let the set of rays R represent the elementary events corresponding to the observations X r 
1 or X r = for each ray r e R, i.e. ray r corresponds to a zero measurement or not. 



Let Ti C 2 1 



0, 1, 2, ... , denote the er-field generated by the collection of subsets of R 



that correspond to all possible events after having observed % randomly selected cells. We set 
JFq = {0, R}. Because observing cell i + 1 just further partitions the current state based on 
the previously observed i cells by possibly removing some ray (or rays) from the set of zero 
measurements, we have a nested sequence (filtration) JF Q C T\ C • • • C JF k of the set 2 R of all 
subsets of R. 

Based on this, for a fixed value of the sparsity parameter k, we define the sequence of random 
variables 

Yi = E[X\Ti\, 1 = 0,1,...,*;, (5.14) 

where Yi, i = 0, 1, . . . , k — 1, are the random variables specifying the expected number of 
zero measurements after having observed k randomly selected cells, conditioned on the subset 
of events Ti determined by the observation of % randomly selected cells. Consequently, Y = 
E[X] = Nfi due to the absence of any information, and Y k = X is just the observed number 
of zero measurements. The sequence (li)i=o,...,fc is a martingale by construction satisfying 
Ep^+il^] = Y h that is condition ( |2~3] >. 

Proposition 5.3. Let = K[X] be the expected number of zero measurements for a given 
sparsity parameter k, given by ( |5.13[ ). Then, for any 8 > 0, 

Pr(|X-iV°| >6) < 2exp^ ^ ^ 

(5.15) 



/* 2 exp ( 



;i - P f) 2D 2 

5 2 



2D 2 k 



if d — > oo. 



This result shows that for large problem sizes d occurring in applications, concentration 
of observations of primarily depends on the sparsity parameter k. As a consequence, the 
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bound enables suitable choices of k = k(d) of the sparsity parameter depending on the problem 
size. 

For example, typical values 

' 0.05c/ in 2D, 
0.05d 2 in 3D, 



k 



(5.16) 



chosen by engineers^ in applications according to a rule of thumb, result in 

2 exp ( - |^ 2 ) in 2D, 
Ty?<-~) in 3D. 



Pv (\X - N° R \ > 5) < 



2exp( -m 2 



(5.17) 



For the 3D case ( |5.16[ ), the probability to observe deviations from N R larger than 1% drops 
below 0.01 for problem sizes d > 77, which is common in practice. 

Thus, the bound ( |5.15| ) is strong enough to indicate not only that ( |5.16[ ) is a particular sensible 
choice, but also leads to more proper choices of k for applications, which still give highly 
concentrated values of observations of N R . This is the essential prerequisite for threshold 
effects of unique recovery from sparse measurements. 

Proof ( Proposition \5. 31 ). Let i?°_ x C R denote the subset of rays with zero measurements after 
the random selection of i — 1 < k cells. For the remaining k — (i — 1) trials, the probability 
that not any cell incident with some ray r E R®_i will be selected, is 

^- (i " 1) =E[X r |^ 



i-lh 



(5.18) 



with pd given by ( |5.1 1[ ). Consequently, by linearity, the expectation Yj_i of zero measurements 
given zero measurements after the selection of i — 1 cells, is 



Yi-x = E[X\ Ti-y 



Efc-(i-l) 
Pd 



(5.19) 



Now suppose we observe the random selection of the i-th cell. We distinguish two possible 
cases. 

(1) Cell i is not incident with any ray r e R®-\- Then the number of zero measurements 
remains the same, and 



Yi 



Furthermore, 

Yi — Yi_i 



p k -^) = \Rl l \p*-\l-p d ) 



reRU (5.21) 
<(\R\- lJpJ-V 

(2) Cell i is incident with 1, . . . , D rays contained in R^. Let R® denote the set R®_i after 
removing these rays. Then 

^ = E rf- ■ 



'Personal communication. 
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Figure 7. The expected number Nc = E[|Cb|] of cells supporting observed 



measurement vectors b, given by ( |5.23[ ). Starting with rate Nc oc k for very 
small values of k, it quickly increases and exceeds N R (Fig. [6]), thus leading to 
underdetermined reduced systems (|5.1[). 



Furthermore, since R? C R^ and |i2g_ x \ Rf\ < D, 



fc-(i-l) 



) 



(5.22) 



refl?_i\*? 

<^r +i -5>r(i-p,)<^; 

Comparing the bounds ( |5.21[ ) and ( |5.22[ ), we have with = -D, 

(Ifll-lJgdPS-^p-gd)^"*. D PdP k d - 1 = (D - Dq d ) Pd . 

Thus, we take the larger bound ( |5.21[ ), drop the immaterial —1 in the first factor and compute 

k 1 _ ^2fc 



Jt—i 



1 -Pd 



^2 - 



Inserting from ( |5.1 1[ ) and expanding in terms of d 1 at 0, we obtain 

l- 1 pf = (k + (k-k 2 )d- 1 + 0(d- 2 ), in 2D 
1-P 2 d ~ \k + (k-k 2 )d- 2 + 0(d- 4 ) 1 in 3D 

Applying Theorem |2 . 1 1 completes the proof. 



> k. 



□ 



5.1.3. Expected Number of Cells. In the previous section, we computed the expected number 
of measurements N R = E[| supp(fe) |] induced by a random unknown A;-sparse vector x (Lemma 
5T2l) along with a tail bound for N R = \R\ — N R (Prop. [53k 



In the present section, we determine the expected number of cells corresponding to N R , 
denoted by N c . We confine ourselves to the practically more relevant 3D case. 

As in the previous section, X E {0, 1}'^' denotes a random vector indicating subsets of 
projection rays. X r — 1, r e R, corresponds to a zero observation along ray r. For a subset of 



rays Rb C R, we say that the corresponding subset of cells Cb in ( |5.2[ ) supports Rb. 



Proposition 5.4. For a given value of the sparsity parameter k, the expected size of subsets of 
cells that support random subsets Rb C R of observed non-zero measurements, is 



N, 



c 



N c (k) =d 3 (i~ 3(1 +3(1 



2d - l\ k 



d* 



- 1 



3ci — 2\ ^ 



(5.23) 
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Proof. We partition the set of rays R = R\ U R2 U R3 according to the three projection images 
(Fig. [2]) and associate with the cells C the corresponding set of triples of projection rays 

#1,2,3 = {(ri,r 2 ,r 3 ): n? =1 r< ^ 0, r, e R u 1 = 1,2,3}, 

with each triple intersecting in a single cell. Thus, we have | ^1,2,3 1 = |C| = d 3 , and each cell 
Cijk = r i H Tj fl r k belongs to the set C b supporting R b if R b fl (7^ U rj U r fe ) 7^ 0. In terms of 
random variables X r indicating zero-measurements by X r = 1, this means that G Q, if not 

X n = X r . = 

N C = E 



X rk = 1. Thus, 



J2(l-X ri )(l-X r2 )(l-X r 



Rl,2,3 

= ^ (* - ( E [ X rJ + E[X r2 ] + E[X r3 \) + ^[X n X rj ] - E[X ri X r2 X r 

Ki,2,3 l<i<j<3 

This expression takes into account the intersection of projection rays r^, rj (inclusion-exclusion 
principle) in order not to overcount the number of supporting cells. 

We have E[X r .} = p k d = (1 - d~ 2 ) k by d5T2l and ( |5TTT ). The event X n X r 
both rays correspond to zero measurements, which happens with probability 



1 means that 



1 - 



r,- U r 



ICI 



1 - 



2rf 



d 3 



We have three pairs of sets of rays from R = Ri U R2 U -R3, and each of the d 2 rays e Ri 
intersects with d rays rj E Rj. Finally, three intersecting rays correspond to zero measurements 
with probability 



1 7*1 U T2 U rs\ 

ici 



1 



3d -2 



for each of the d 3 cells c e C. 



□ 



Remark 5.3. Note that specifies the expected value of n re d in ( |5.3[ ) induced by random 
fc-sparse vectors x6lj,. See FigurefTlfor an illustration. 



5.1.4. Overdetermined Reduced Systems: Critical Sparsity k. For small value of k, that is 
for highly sparse scenarios, the expected value Nn(k) 3k grows faster than Nc(k) ~ k. 
Consequently, the expected reduced system due to Definition 5.1 will be overdetermined. This 
holds up to a critical value k < k crit because for increasing values of k, it is more likely that 
several particles are incident with some projection ray, making iV^ increasing faster than N R . 



Proposition 5.5. For k < k cr u, the reduced system ( |5.1[ ) will be overdetermined with high 
probability, where k crit solves 

N R (k crit ) = N c {k mt ) (5.24) 
and Nii(k cr it), Nc{k cr it) are given by ( |5.13[ ) and ( |5.23[ ). 

Figure M shows the dependency k crit = k crit (d) on the problem size d, as defined by (]5.24|) . 



5.2. Unperturbed Systems. We consider the recovery properties of the 3D setup depicted in 
Fig. [2j based on Theorem 3.4 and on the expected quantities involved in the corresponding 
condition ( |3.8[ ), as worked out in Section 5.1| Concerning the interpretation of the following 
claims, we refer to Remark [5T1 



Proposition 5.6. The system Ax = b, with measurement matrix A given by ( |2.2[ ), admits unique 
recovery ofk-sparse non-negative vectors x with high probability, if 



k < 



N c (ks) = 1 
1 + 5 ~ 35(1 + 5) 



N R (k s 



5 > 



v/5 



(5.25a) 
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Figure 8. Values k crit = k crit (d) of the sparsity parameter such that k < k, 



crit 



yield overdetermined reduced systems ( |5.1[ ). For the depicted and practically 
relevant range of d, the slope of the log-log curve slightly decreases in 
1.65... 1.55. 

where kg solves 

N R (k s ) = 35N c (k s ) 
and Nn(k),Nc(k) are given by ( |5.13[ ) and ( |5.23[ ). 



(5.25b) 



Proof. The assertion follows from replacing the quantities forming condition ( |3.8[ ) by their 
expected values, due to Remarks |5.2| and |5.3[ □ 



Remark 5.4. Equation (5.25b) shows that unique recovery of a A;-sparse, k < 



(1+8)- 



non- 



negative vector can be expected using the unperturbed measurement matrix provided the re- 
duced system ( |5.1[ ) is by a factor m re d > 1.854 n re d overdetermined. See Figure [9] for an 
illustration. 

5.3. Perturbed Systems. Analogously to the previous section, we evaluate the average recov- 



ery performance using perturbed systems based on Theorem 3.5 



Proposition 5.7. The system Ax = b, with perturbed measurement matrix A given by ( |2.2| ), 
admits unique recovery of k-sparse non-negative vectors x with high probability, if k satisfies 
condition k < k crit from Prop. 5.5 that is, if the reduced system (5.1 ) is overdetermined. 



Proof. Immediate from Theorem |3.5[ replacing the quantities forming condition p.9[ ) by their 
expected values, and taking into account I = 3 for the measurement matrix ( |2.2[ ) and the case 
D = 3. □ 



Remark 5.5. In view of this assertion and Remark 5.4 it is remarkable that a significant gain 
of recovery performance can be obtained by a simple device: structure -preserving perturbation 
of the measurement matrix. See Figure [9] for an illustration. 

5.4. Underdetermined Perturbed Systems. Based on ( |2.7| ) and the average case analysis of 



condition p.9[ ) (Section 5.1 ), we devise a criterion for determining the maximal sparsity value 
k (minimal sparse scenario), such that any A;-sparse vector x can be uniquely recovered with 



high probability using the measurement matrix A given by ( |2.2[ ). Unlike Propositions 5.6 and 
5.7[ we specifically consider here less sparse scenarios that result in underdetermined reduced 
systems (|5 . 1 h. 



Proposition 5.8. Let A be a matrix satisfying the assumptions of Lemma 3.3 with tq 
N R {k max ), where k max solves 



N R (k„ 



5N c (k r , 



S > 



(5.26) 
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Figure 9. Critical upper bound sparsity values k = k(d) that guarantee 
unique recovery of fc-sparse vectors x on average with high probability. From 
bottom to top: k$ (|5.25a[) for unperturbed matrices A, k crit (|5.24[) resulting in 



overdetermined reduced systems, k max ( |5.27[ ) for underdetermined perturbed 
matrices A, and fully random measurement matrices. 



with N R (k), N c {k) given by ( |5.13[ ) and ( |5.23[ ). Then a k-sparse vector x can be uniquely 
recovered with high probability, if 

N R(k max ) 



k< k ■n 



(5.27) 



Proof. By assumption and Lemma 3.3[ Theorem 3.2 (see also Remark 3.1[ ) implies ( |5.27| ), 
thereby taking into account that Eqn. ( |5.26[ ) defining k max reflects the expected version of 
condition ( |3.8| ), subdivided by the factor 3 due to ( |5.27[ ). □ 

[fT9l l5llT3l Figure [9] illustrates the value k max ( |5.27[ ) and compares it to the previous results. 



Finally, we comment on the uniqueness condition established in |fP3l which corresponds 
to the top k(d) curve in Figure [9j This result does not apply to our setting. The reason is 
that a basic assumption underlying the application of ( |2.7[ ) does not hold. While after some 
perturbation the points corresponding to the columns of A and the sparsity value |I~(:e)| = k 
are in general position, the underlying distribution lacks symmetry with respect to the origin. 
As a result, we cannot establish the superior performance of "fully" random sensors considered 
in 031. 

5.5. Two Cameras are Not Enough. In the present section, we briefly discuss how the pre- 
viously obtained bounds on sparsity apply in the 2D scenario. To this end, we first compute 
the expected value of nonempty cells connected to Rb measurements generated by a A; sparse 
nonnegative vector. 

Proposition 5.9. In 2D, the expected size of subsets of cells that support random subsets Rb C 
R of observed non-zero measurements, is 

2 



N c = N c (k) 
for a given sparsity parameter k, 



d 2 1 



1 



d 



(5.28) 



Proof. We partition the set of rays R = RiUR 2 according to the two projection images (Fig. [2]), 
left, and associate with the cells C the corresponding set of pairs of projection rays 

Ri,2 = {(ri,r 2 ): n? =1 r, ^ 0, r 4 G Ri, i = 1,2}, 
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with each pair intersecting in a single cell. Thus, we have l-Ri^l = \C\ = d 2 , and each 
cell Cij = Ti fl Tj belongs to the set C& supporting R b if R b n (r* U r,) 7^ 0. In terms of 
random variables X r indicating zero-measurements by X r 
X r . = X r . = 1. Thus, 



1, this means that c*.- G C& if not 



X, 



E[^(l-X n )(l-X r2 

^(l-(E[X ri ]+E[X r ,])+ £ E[X r X r J] 



i<*<i<2 



taking the intersection of projection rays r i} Tj into account. We obtained E[X r .] 
(1 



Pd 



\) k in (|5.12|) and (|5.11|>. The event that both rays correspond to zero measurements 



X r ., X r 



1 happens with probability 



TV U Ti 



\c\ 



2d - 1 \ * 



d 2 



l X 2fe 

7l 



□ 



By Prop. 5.9 and Lemma 5.2 we can now compute the the expected ratio of the dimensions of 
the reduced system, further denoted by c. We solve the polynomial Nn(k) = cNc(k) according 
to and ( |5.28[ ). Interesting are the values c G {25, 1,5, |}. For example, if c = 25, we obtain 
guaranteed recovery of all 1-sparse vectors, which also equals the strong threshold for the 2D 
case. If c = 1, we obtain, on average, that any fc-sparse vector x, with 



(5.29) 



induces reduced reduced overdetermined systems. Thus two particles can always be recon- 
structed, after perturbation. If c = \ the critical sparsity value approximately equals 4 for 
arbitrary d. This is the best achievable bound, which is obviously useless for application. For 
k = 3 it can be shown that the probability of correct recovery via the perturbed matrix A 2 , is 



2-4- fflffl+4.ffl ! 
(?) 



d 2 + 6d - 10 d- 
3{d 2 - 2) 



J -> 1/3 . 



We mention that the expected relative values of N R and N c do not vary much with different 
two camera arrangements. This highly pessimistic results can be explained by the fact that 
there is no expander with constant left degree t less than 3. 



6. Numerical Experiments and Discussion 

In this section we empirically investigate bounds on the required sparsity that guarantee 
unique nonnegative or binary A;- sparse solutions. 

6.1. Reduced Systems versus Analytical Sparsity Thresholds. The workhorse of the previ- 



ous theoretical average case performance analysis of the discrete tomography matrix from ( |4.1[ ) 
is the derivation of the expected number of nonzero rows Nu(k) induced by the A;-sparse vector 
along with the number Nc(k) of "active" cells which cannot be empty. This can be done also 
empirically, see Fig. [10} left, for the 2D case and right, for the 3D case. To generate the figures 
we varied k G {1, 2, • • • , 2000} and d G {10, 11, • • • , 100} in 2D and k G {1, 2, • • • , 2000} 
and d G {10, 11, • • • , 100} in 3D, respectively, and generated for each point [k, d) 500 problem 
instances. The plots show N R (k, d) /N c (k, d) along with the curves: k s ( |5.25a[ ) for unperturbed 
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Figure 10. The contourplots of the average fraction of the reduced systems as 
a function of the resolution parameter d and the sparsity parameter k. Left: In 
agreement with the results of Section 5.5, the plots for the 2D case show that 
level lines of N R (k, d)/Nc(k, d) are constant with varying d. Right: In 3D the 
situation dramatically changes. Higher sparsity values are allowed for increas- 
ing values of d, as the derived threshold curves show. Below the blue curve k§ 



( |5.25a| ) reconstruction for unperturbed systems is guaranteed with high probabil- 
ity. Below the dashed red curve k crit ( |5.24[ ) reduced systems are overdetermined. 
For points below the solid red curve k max ( |5.27[ ) reconstruction is guaranteed for 



perturbed systems. Finally, problem instances under the green curve k opt ( |6.1| ) 
could be recovered if the reduced matrices would follow a symmetrical distri- 
bution with respect to the origin. 



matrices A, k crit ( |5.24[ ) resulting in overdetermined reduced systems, k r , 
termined perturbed matrices A, and k opt which solves 



(5.27) for underde- 



N R (k opt ) = 0.5N c (k opt ) 



(6.1) 



6.2. Empirical Phase Transitions. We further concentrate on the 3D case. In analogy to ifTOl 
we assess the so called phase transition p as a function of d, which is reciprocally proportional 
to the undersampling ratio — G (0, 1). We consider d G {10, 11, ... , 100}, the corresponding 
matrix A^ g ]R 3d2xd3 from ( |4.1[ ) and its perturbed version A and the sparsity as a fraction of d 2 , 
k = pd 2 , for p G (0, 1). 

This phase transition p(d) indicates the necessary relative sparsity to recover a A;-sparse solu- 
tion with overwhelming probability. More precisely, if ||x||o < p(d) ■ d 2 , then with overwhelm- 
ing probability a random A;- sparse nonnegative (or binary) vector x* is the unique solution in 
J-+ := {x: Ax = Ax*,x > 0} or .Fo,i : = Ax = Ax*,x G [0, l] n }, respectively. Unique- 
ness can be "verified" by minimizing and maximizing the same objective f T x over T+ or T^\, 
respectively. If the minimizers coincide for several random vectors / we claim uniqueness. As 



shown in Fig. 12 the threshold for a unique nonnegative solution and a unique 0/1 -bounded 



solution are quite close. 

To generate the success and failure transition plots we generated A according to ( |4.1[ ) and 
A by slightly perturbing its entries and varying d G {10, 11, ... , 100} A has the same sparsity 
structure as A, but random entries drawn from the standard uniform distribution on the open 
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interval (0.9, 1.1). We have tried different perturbation levels, all leading to similar results. 
Thus we adopted this interval for all presented results. 

Then for p E [0, 1] a pcP-sparse nonnegative or binary vector was generated to compute 
the right hand side measurement vector and for each (d, p) -point 50 random problem instances 
were generated. A threshold-effect is clearly visible in all figures exhibiting parameter regions 
where the probability of exact reconstruction is close to one and it is much stronger for the per- 
turbed systems. The results are in excellent agreement with the derived analytical thresholds. 
We refer to the figure captions for detailed explanations. Finally, we refer to the summary in 



Figure 1 1 for the computed sharp sparsity thresholds, which are in excellent agreement with 
our numerical experiments. 




1500 



FIGURE 1 1 . Relative critical upper bound sparsity values k(d) in the practi- 
cal relevant domain d 6 (500, 1500) that guarantee unique recovery of /c-sparse 
vectors x on average with high probability. From bottom to top: k$ ( |5.25a| ) 
for unperturbed matrices A (blue line), k crit ( |5.24[ ) resulting in overdetermined 
reduced systems (dashed red line), k max ( |5.27[ ) and k max ( |5.26| ) for underdeter- 



mined perturbed matrices A (solid red and pink line), and ideal random measure- 
ment matrices k opt (green line). The thin black line depicts the particle density 
used by engineers in practice, while the black spot corresponds to the typical 
resolution parameter d = 1024. The results demonstrate that specific slight ran- 
dom perturbations of the TomoPIV measurement matrix considerably boost the 
expected reconstruction performance by at least 150%. 



7. Conclusions 

The main contribution of this work is the transfer of recent results on compressive sensing 
via expander graphs with bad expansion properties to the discrete tomography problem. In 
particular, we consider a sparse binary measurement matrix, which encodes the incidence re- 
lation between projection rays and image discretization cells, along with its slightly perturbed 
counterpart. While the expected expansion of the underlying graph does not change with per- 
turbation, the recovery performance can be boosted significantly. We investigate the average 
performance in recovery of exact sparse nonnegative signals by analyzing the properties of re- 
duced systems obtained by eliminating zero measurements and related redundant discretization 
cells. We compute sharp sparsity thresholds, such that the maximal sparsity can be determined 
precisely for both perturbed and unperturbed scenarios. Our theoretical analysis suggests that 
a similar procedure can be applied to different geometries. 
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Figure 12. Left: Recovery via the unperturbed matrix (blue curves), 
d G {10,20,30} (from top to down) versus the perturbed counterpart (red 
curves). The dash-dot line depicts the empirical probability (500 trials) that 
reduced systems are overdetermined and have full rank. The solid line (blue: 
unperturbed, red: perturbed) shows the probability that a /c-sparse nonnegative 
vector is unique. The dashed curve shows the probability that a /c-sparse binary 
solution is the unique solution of in [0, l] n . Additional information like binarity 
gives only a slight performance boost. The curve k$ ( |5.25a| ) correctly predicts 
that 18 (d = 10), 48 (d = 20), and 85 (d = 30) particle are reconstructed with 
high probability via the unperturbed systems and 66 (d = 10), 181 (d = 20), 
328 (d = 30) particles, via the perturbed systems according to k max \5.21) . 
However, 105 (rf = 10), 241 (d = 20), 408 (d = 30), by k max from ( gg6) ) 
are more accurate. Division by three does not seem to be necessary. Right: 
Empirical probability obtained from 10000 trials that k random columns of the 
unperturbed matrix (solid black line) or of the perturbed matrix (dashed black 
line) are linearly independent. 
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Figure 1 3 . Left: Success and failure empirical phase transitions for unper- 
turbed and perturbed systems right. Top: Probability that the reduced matrices 
are overdetermined and of full rank, along (right) with the estimated relative 
critical sparsity level kk r u (dashed red line) which induces overdetermined re- 
duced matrices. Middle: Probability of uniqueness of a A; 



pd 2 sparse nonneg- 



ative vector. Bottom: Probability of uniqueness in [0, l] n of a k = pd 2 sparse 
binary vector. The blue curve depicts again k$ ( |5.25a[ ), the dashed red curve 
k crit ( |5.24[ ), the solid red curve k max ( |5 .27 ), k max ( |5.27| ) and the green curve 
k op t (6.1). In case of the perturbed matrix A exact recovery is possible beyond 
overdetermined reduced matrices. Moreover k max follows most accurately the 
empirical phase transition for perturbed systems. 
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